PRODUCTION AND USE OF TYPE 5 17BETA-HYDROXYSTEROID DEHYDROGENASE

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates to the isolation and characterization of a novel
enzyme which is implicated in the production of sex steroids, and more particularly,
to the characterization of the gene and cDNA of a novel 20α, 17β-hydroxysteroid
dehydrogenase (hereinafter type 5 17β-HSD) which has been implicated in the
conversion of progesterone and 4-androstenedione (Δ4-dione) to
20α-hydroxyprogesterone (20α-OH-P) and testosterone (T), respectively. The use of this
enzyme in an assay for inhibitors of the enzyme is also described, as are several other
uses of the DNA, fragments thereof and antisense fragments thereof.

Description of the Related Art

The enzymes identified as 17β-HSDs are important in the production of human
sex steroids, including androst-5-ene-3β,17β-diol (Δ5-diol), testosterone and estradiol.
It was once thought that a single gene encoded a single type of 17β-HSD which was
responsible for catalyzing all of the reactions. However, in humans, several types of
17β-HSD have now been identified and characterized. Each type of 17β-HSD has
been found to catalyze specific reactions and to be located in specific tissues. Further
information about Types 1, 2 and 3 17β-HSD can be had by reference as follows:
Type 1 17β-HSD is described by Luu-The, V. et al., Mol. Endocrinol., 3:1301-1309
(1989) and by Peltoketo, H. et al., FEBS Lett., 239:73-77 (1988); Type 2 17β-HSD is
described in Wu, L. et al., J. Biol. Chem., 268:12964-12969 (1993); Type 3 17β-HSD
is described in Geissler, WM, Nature Genetics, 7:34-39 (1994). A fourth type, which
is homologous to porcine ovarian 17β-HSD, has recently been identified by researchers
Adamski and de Launoit; however, applicant is not presently aware of published
information on this type.

The present invention relates to a fifth type of 17β-HSD which is described in
detail below.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a novel 17β-hydroxysteroid
dehydrogenase (17β-HSD) which is identified as type 5 17β-HSD.

It is also an object of the present invention to provide a 17β-HSD which has
been shown to be involved in the conversion of Δ4-dione to testosterone and in the
conversion of progesterone to 20α-hydroxyprogesterone (20α-OH-P).

It is a further object of this invention to provide the nucleotide sequences and a
gene map for type 5 17β-HSD.

It is also an object of this invention to provide methods of using type 5 17β-HSD
in an assay to identify compounds which inhibit the activity of this enzyme, and
thus may reduce production of testosterone or 20α-hydroxyprogesterone, and can be
used to treat medical conditions which respond unfavorably to these steroids.

It is an additional object of this invention to provide methods of preventing the
synthesis of type 5 17β-HSD by administering an antisense RNA of the gene sequence
of type 5 17β-HSD to interfere with the translation of the gene's mRNA.

These and other objects are discussed herein.

In particular, a novel enzyme, type 5 17β-hydroxysteroid dehydrogenase, has
been identified and characterized. The gene sequence for this type 5 17β-HSD was
found to encode a protein of 323 amino acids, having an apparent calculated molecular
weight of 36,844 daltons. The protein is encoded by nucleotides +11 through 982,
including the stop codon (amino acids +1 through 323), numbered in the 5' to 3'
direction, in the following sequence (SEQ ID Nos. 1 and 2):



GTGACAGGGA ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC    49
           Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly
           1               5                   10

CAC TTC ATG CCT GTA TTG GGA TTT GGC ACC TAT GCA CCT CCA GAG GTT    97
His Phe Met Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val
    15                  20                  25

CCG AGA AGT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG   145
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly
30                  35                  40                  45

TTC CGC CAT ATA GAT TCT GCT CAT TTA TAC AAT AAT GAG GAG CAG GTT   193
Phe Arg His Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val
                50                  55                  60

GGA CTG GCC ATC CGA AGC AAG ATT GCA GAT GGC AGT GTG AAG AGA GAA   241
Gly Leu Ala Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu
            65                  70                  75

GAC ATA TTC TAC ACT TCA AAG CTT TGG TCC ACT TTT CAT CGA CCA GAG   289
Asp Ile Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu
        80                  85                  90

TTG GTC CGA CCA GCC TTG GAA AAC TCA CTG AAA AAA GCT CAA TTG GAC   337
Leu Val Arg Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp
    95                  100                 105

TAT GTT GAC CTC TAT CTT ATT CAT TCT CCA ATG TCT CTA AAG CCA GGT   385
Tyr Val Asp Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly
110                 115                 120                 125

GAG GAA CTT TCA CCA ACA GAT GAA AAT GGA AAA GTA ATA TTT GAC ATA   433
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile
                130                 135                 140

GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA   481
Val Asp Leu Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala
            145                 150                 155

GGA TTG GCC AAG TCC ATT GGG GTG TCA AAC TTC AAC CGC AGG CAG CTG   529
Gly Leu Ala Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu
        160                 165                 170

GAG ATG ATC CTC AAC AAG CCA GGA CTC AAG TAC AAG CCT GTC TGC AAC   577
Glu Met Ile Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn
    175                 180                 185

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT TTC   625
Gln Val Glu Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe
190                 195                 200                 205

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT   673
Cys Lys Ser Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser
                210                 215                 220

CAA CGA GAC AAA CGA TGG GTG GAC CCG AAC TCC CCG GTG CTC TTG GAG   721
Gln Arg Asp Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu
            225                 230                 235

GAC CCA GTC CTT TGT GCC TTG GCA AAA AAG CAC AAG CGA ACC CCA GCC   769
Asp Pro Val Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala
        240                 245                 250

CTG ATT GCC CTG CGC TAC CAG CTG CAG CGT GGG GTT GTG GTC CTG GCC   817
Leu Ile Ala Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala
    255                 260                 265

AAG AGC TAC AAT GAG CAG CGC ATC AGA CAG AAC GTG CAG GTT TTT GAG   865
Lys Ser Tyr Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu
270                 275                 280                 285

TTC CAG TTG ACT GCA GAG GAC ATG AAA GCC ATA GAT GGC CTA GAC AGA   913
Phe Gln Leu Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg
                290                 295                 300

AAT CTC CAC TAT TTT AAC AGT GAT AGT TTT GCT AGC CAC CCT AAT TAT   961
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr
            305                 310                 315

CCA TAT TCA GAT GAA TAT TAA CATGGAGACT TTGCCTGATG ATGTCTACCA     1012
Pro Tyr Ser Asp Glu Tyr *
        320

GAAGGCCCTG TGTGTGGATG GTGACGCAGA GGACGTCTCT ATGCCGGTGA CTGGACATAT 1072

CACCTCTACT TAAATCCGTC CTGTTTAGCG ACTTCAGTCA ACTACAGCTC ACTCCATAGG 1132
CCAGAAATAC AATAAATCCT GTTTAGCGAC TTCAGTCAAC TACAGCTCAC TCCATAGGCC 1192
AGAAATACAA TAAA                                                   1206

In addition, a complete gene map (Figure 5) and nucleotide sequences (SEQ ID Nos.
3 through 29 and Figures 6A and 6B) of the chromosomal DNA of type 5 17β-HSD
are provided. A more detailed description of the sequences will be provided infra.

The present invention includes methods for the synthetic production of type 5
17β-HSD, as well as peptides that are biologically functionally equivalent, and
methods of using these compounds to screen test compounds for their ability to inhibit
or alter the enzymatic function. In addition, methods of producing antisense
constructs to the type 5 17β-HSD gene's DNA or mRNA or portions thereof, and the
use of those antisense constructs to interfere with the transcription or translation of the
enzyme, are also provided.

The nucleotide sequence which encodes type 5 17β-HSD and recombinant
expression vectors which include the sequence may be modified so long as they
continue to encode a functionally equivalent enzyme. Moreover, it is contemplated,
within the invention, that codons within the coding region may be altered, inter alia,
in a manner which, given the degeneracy of the genetic code, continues to encode the
same protein or one providing a functionally equivalent protein. It is believed that
nucleotide sequences analogous to SEQ ID No. 1, or those that hybridize under
stringent conditions to the coding region of SEQ ID No. 1, are likely to encode a type
5 17β-HSD functionally equivalent to that encoded by the coding region of SEQ ID
No. 1, especially if such analogous nucleotide sequence is at least 700, preferably at
least 850 and most preferably at least 969 nucleotides in length. As used herein,
except where otherwise specified, "stringent conditions" means 0.1x SSC (0.3 M
sodium chloride and 0.03 M sodium citrate) and 0.1% sodium dodecyl sulphate (SDS)
and 60° C.

WO 97/11162 PCT/CA96/00605

It is also likely that tissues or cells from human or non-human sources, which
tissues or cells have the enzymatic machinery to convert Δ4-dione to
testosterone, or to convert progesterone to 20α-hydroxyprogesterone, include a type 5
17β-HSD sufficiently analogous to human type 5 17β-HSD to be used in accordance
with the present invention. In particular, cDNA libraries prepared from cells
performing the foregoing conversions may be screened with probes, prepared in
accordance with well known techniques by reference to the nucleotides disclosed herein,
under varying degrees of stringency, in order to identify analogous cDNAs in
other species. These analogous cDNAs are preferably at least 70% homologous to
SEQ ID No. 1, more preferably at least 80% homologous, and most preferably at
least 90% homologous. They preferably include stretches of perfect identity at least
10 nucleotides long, more preferably stretches of 15, 20 or even 30 nucleotides of
perfect identity. Appropriate probes may be prepared from SEQ ID No. 1 or
fragments thereof of suitable length, preferably at least 15 nucleotides in length.
Confirmation with at least two distinct probes is preferred. Alternative isolation
strategies, such as polymerase chain reaction (PCR) amplification, may also be used.
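
The homology thresholds and identity stretches described above can be illustrated with a minimal sketch. It assumes that "homologous" is read as simple percent identity over two sequences of equal length that have already been aligned without gaps, which is a simplification. The reference fragment is the first 39 coding nucleotides of SEQ ID No. 1; the candidate fragment, and the function names, are hypothetical and chosen only for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a.upper(), b.upper()):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


# First 13 codons of the coding region of SEQ ID No. 1 (nucleotides 11 to 49).
reference = "".join("ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC".split())
# Hypothetical candidate cDNA fragment differing at three third-codon positions.
candidate = "".join("ATG GAC TCC AAA CAG CAG TGT GTG AAG CTA AAT GAT GGT".split())

identity = percent_identity(reference, candidate)
run = longest_identical_run(reference, candidate)
print(f"identity: {identity:.1f}%  longest identical stretch: {run} nt")
print("meets 90% threshold:", identity >= 90.0, "| stretch >= 15 nt:", run >= 15)
```

In practice a gapped alignment tool would be used over full-length sequences; the sketch only shows how the two criteria (overall identity and a minimum stretch of perfect identity) are evaluated separately.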

Homologous type 5 17β-HSDs so obtained, as well as the genes encoding
them, are used in accordance with the invention in all of the ways for using SEQ ID
No. 2 and SEQ ID No. 1, respectively.

Recombinant expression vectors can include the entire coding region for
human type 5 17β-HSD as shown in SEQ ID No. 1, the coding region for human type
5 17β-HSD which has been modified, portions of the coding region for human type 5
17β-HSD, the chromosomal DNA of type 5 17β-HSD, an antisense construct to type
5 17β-HSD, or portions of antisense constructs to type 5 17β-HSD.

In the context of the invention, "isolated" means having a higher purity than
exists in nature, but does not require purification from a natural source. Isolated
nucleotides encoding type 5 17β-HSD may be produced synthetically, or by isolating
cDNA thereof from a cDNA library or by any of numerous other methods well
understood in the art.

In one embodiment, the invention provides an isolated nucleotide sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 1 or a complement thereof, to hybridize under stringent
conditions to the coding region of SEQ ID No. 1 or a complement thereof and said
sequence encoding an enzyme which catalyzes the conversion of progesterone to
20α-hydroxyprogesterone and the conversion of 4-androstenedione to testosterone.

In a further embodiment, the invention provides an isolated nucleotide
sequence comprising at least ten consecutive nucleotides identical to 10 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

In an additional embodiment, the invention provides an oligonucleotide
sequence selected from the group consisting of SEQ ID Nos. 30 through 59.

In another embodiment, the invention provides a recombinant expression
vector comprising a promoter sequence and an oligonucleotide sequence selected from
the group of SEQ ID Nos. 30 to 59.

In a further embodiment, the invention provides a method of blocking synthesis
of type 5 17β-HSD, comprising the step of introducing an oligonucleotide selected
from the group consisting of SEQ ID Nos. 30 to 59 into cells.

In an additional embodiment, the invention provides an isolated chromosomal
DNA fragment which upon transcription and translation encodes type 5
17β-hydroxysteroid dehydrogenase and wherein said fragment contains nine exons and
wherein said fragment includes introns which are 16 kilobase pairs in length.

In another embodiment, the invention provides an isolated DNA sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 3 or a complement thereof, to hybridize under stringent
conditions to SEQ ID No. 3, or its complement.

In a further embodiment, the invention provides a method for producing type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of preparing a recombinant
host transformed or transfected with the vector of claim 3 and culturing said host
under conditions which are conducive to the production of type 5 17β-hydroxysteroid
dehydrogenase by said host.

In an additional embodiment, the invention provides a method for determining
the inhibitory effect of a test compound on the enzymatic activity of type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of providing type 5
17β-hydroxysteroid dehydrogenase; contacting said type 5 17β-hydroxysteroid
dehydrogenase with said test compound; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase in the presence of said test
compound.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 1 or a complement thereof.

In a further embodiment, there is provided a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 1 or a complement thereof.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 3 or a complement thereof.

In another embodiment, the invention provides a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 3 or a complement thereof.

In a further embodiment, there is provided a method for determining the
inhibitory effect of antisense nucleic acids on the enzymatic activity of type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of providing a host system
capable of expressing type 5 17β-hydroxysteroid dehydrogenase; introducing said
antisense nucleic acids into said host system; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase.

Other features and advantages of the present invention will become apparent 
from the following description of the invention which refers to the accompanying 
drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B are graphs showing the enzymatic activities of Type 5
17β-HSD on various substrates. The enzyme was expressed in embryonal kidney (293)
cells (ATCC CRL 1573) which were transfected with a vector, prepared in accordance
with the invention, and containing the gene encoding human type 5 17β-HSD. Figure
1A shows the substrate specificity of type 5 17β-HSD. The concentration of each
substrate was 0.1 μM. Figure 1B shows the time course of the 20α-HSD and
17β-HSD activities of cells transfected with vectors containing human type 5
17β-HSD. The substrates, progesterone (P) and Δ4-dione, were added at a concentration
of 0.1 μM;

Figure 2 is a map of a pCMV vector which is exemplary of one that can be
used to transfect host cells in accordance with the invention;

Figure 3 is the cDNA sequence (SEQ ID No. 1) and the deduced amino acid
sequence (SEQ ID No. 2) of human type 5 17β-HSD. The nucleotide sequence is
numbered in the 5' to 3' direction with the adenosine of the initiation codon (ATG)
designated as +11. The translation stop codon is indicated by asterisks. The
potential post-translational modification sites are underlined, wherein TSK = tyrosine
sulfokinase; CK2 = casein kinase II; PKC = protein kinase C; NG = N-glycosylation;
and NM = N-myristoylation;

Figure 4 is a comparison of the deduced amino acid sequence of human type 5
17β-HSD to the amino acid sequences of rabbit (rb), rat (r), and bovine (b) 20α-HSD
as well as human (h) and rat (r) 3α-HSD, bovine prostaglandin F synthase (b pgfs) and
frog ρ-crystallin (f ρ-crys). The amino acid sequences are indicated using the conventional
single letter code and are numbered on the right. The dashes (-) and dots (.) indicate
identical and missing amino acid residues, respectively;

Figure 5 is a map of the chromosomal DNA of a gene which encodes type 5
17β-HSD; and

Figures 6A and 6B are the nucleotide sequence of the chromosomal DNA of a
gene which encodes type 5 17β-HSD.

DETAILED DESCRIPTION OF THE INVENTION

A gene encoding the enzyme, type 5 17β-HSD, has been isolated and encodes
a protein having 323 amino acids with a calculated molecular weight of 36,844
daltons. As shown in Figure 3, the coding portion of this gene includes nucleotides
+11 through 982, including the stop codon (and encodes amino acids +1 through
323), numbered in the 5' to 3' direction.
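
The stated coordinates can be checked with simple arithmetic. The short sketch below uses only the figures given above (coding nucleotides +11 through 982, stop codon included) and confirms that they correspond to 323 amino acids; the variable names are illustrative.

```python
CODING_START = 11   # adenosine of the ATG initiation codon
CODING_END = 982    # last nucleotide of the stop codon

coding_nt = CODING_END - CODING_START + 1   # inclusive span of the coding region
codons, remainder = divmod(coding_nt, 3)    # whole codons, stop codon included
amino_acids = codons - 1                    # the stop codon encodes no residue

print(f"{coding_nt} nt = {codons} codons (remainder {remainder}) "
      f"= {amino_acids} amino acids + stop")
```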

The chromosomal DNA fragment of the gene for type 5 17β-HSD has also
been characterized. A map of the gene is provided in Figure 5. In particular, it was
found, using primer extension analysis, that the gene includes 16 kilobase pairs (kb)
and contains nine short exons. A portion of the 5' flanking region, as set forth in
SEQ ID No. 3, of the genomic DNA includes 730 base pairs (bp). Exon 1 (SEQ ID
No. 4) contains 37 nucleotides in the 5'-noncoding region and the nucleotides for the
first 28 amino acids. The second intron region includes the nucleotides set forth in
SEQ ID Nos. 5 and 6, which are 252 and 410 bp, respectively. These are joined by a
1.2 kb region which is not important and therefore, its sequence has been omitted.
Exon 2 (SEQ ID No. 7) contains nucleotides for the following 56 amino acids of
human type 5 17β-HSD. The following intron region includes SEQ ID Nos. 8 and 9,
700 and 73 bp, respectively, which are joined by a 0.1 kb region for which the
sequence has not been provided. Exon 3 (SEQ ID No. 10) includes the next 117
nucleotides which specify the following 39 amino acids. The fourth intron region is
represented by SEQ ID Nos. 11 and 12, 152 and 208 nucleotides in length,
respectively, with a 0.9 kb region in between which has not been provided. Exon 4
(SEQ ID No. 13) includes the next 78 bp which specify the following 26 amino acids
of the enzyme. Intron region five contains SEQ ID Nos. 14 and 15, with 98 and 249
nucleotides, respectively, with a 0.1 kb region in the middle which has not been
provided. The fifth exon (SEQ ID No. 16) contains nucleotides for the following 41
amino acids of human type 5 17β-HSD. The sixth intron region, set forth in SEQ ID
Nos. 17 and 18 with 138 and 189 bp, respectively, also includes a 2.8 kb region
which has not been provided. Exon 6 (SEQ ID No. 19) contains nucleotides for the
following 36 amino acids of type 5 17β-HSD, as well as two nucleotides of
codon 227 (Trp). The next intron region includes a 136 bp portion (SEQ ID No. 20) and a
66 bp portion (SEQ ID No. 21) which are joined by a 0.1 kb region which is not set
forth. Exon 7 (SEQ ID No. 22) contains the third nucleotide of codon
227 (Trp) and nucleotides for the following 55 codons. The following intron region
includes a 136 nucleotide region (SEQ ID No. 23), a 2.5 kb region which is not
provided and a 286 bp region (SEQ ID No. 24). Exon 8 (SEQ ID No. 25) includes 83
nucleotides which code for the following 27 amino acids and 2 nucleotides of codon
310. The ninth intron region contains 713 nucleotides (SEQ ID No. 26) followed by a
1 kb region which has not been provided followed by a 415 nucleotide region (SEQ
ID No. 27). Exon 9 (SEQ ID No. 28) contains the third nucleotide of codon 310, 42
nucleotides for the last 13 amino acids and a stop codon, and approximately 200
nucleotides in the 3'-untranslated region. A polymorphic (GT)n repeat region that can
be used to perform genetic linkage mapping of the type 5 17β-HSD gene can be found 255
nucleotides downstream from the TAA stop codon. SEQ ID No. 29 sets forth 109 bp
of additional genomic sequence. The nucleotide sequence of the gene fragment, as
described above, is provided in Figures 6A and 6B.
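
The exon-by-exon description above can be cross-checked against the 323-residue protein. The following sketch tallies the coding nucleotides contributed by each exon using only the counts stated in the preceding paragraph; the tuple layout and variable names are assumptions made for the illustration.

```python
# (exon number, nt completing a split codon, whole codons, nt starting a split codon)
# Values are taken from the description of exons 1 to 9; the stop codon is excluded.
EXONS = [
    (1, 0, 28, 0),
    (2, 0, 56, 0),
    (3, 0, 39, 0),
    (4, 0, 26, 0),
    (5, 0, 41, 0),
    (6, 0, 36, 2),   # ends with two nucleotides of codon 227 (Trp)
    (7, 1, 55, 0),   # begins with the third nucleotide of codon 227
    (8, 0, 27, 2),   # ends with two nucleotides of codon 310
    (9, 1, 13, 0),   # begins with the third nucleotide of codon 310
]

coding_nt = 0
split_codons = []
for exon, lead, whole, trail in EXONS:
    coding_nt += lead + 3 * whole + trail
    if trail:
        # The codon interrupted by the following intron is the next incomplete one.
        split_codons.append(coding_nt // 3 + 1)

residues, leftover = divmod(coding_nt, 3)
print(f"coding nucleotides: {coding_nt}, residues: {residues}, leftover: {leftover}")
print("codons split by an intron:", split_codons)
```

The tally gives 969 coding nucleotides, which agrees with the 323 amino acids stated above and with the two split codons (227 and 310) named in the text.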

The type 5 17β-HSD enzyme can be produced by incorporating the nucleotide
sequence for the coding portion of the gene into a vector which is then transformed or
transfected into a host system which is capable of expressing the enzyme. The DNA
can be maintained transiently in the host or can be stably integrated into the genome of
the host cell. In addition, the chromosomal DNA can be incorporated into a vector
and transfected into a host system for cloning.

In particular, for the cloning and expression of type 5 17β-HSD, any common
expression vectors, such as plasmids, can be used. These vectors can be prokaryotic
expression vectors, including those derived from bacteriophage λ, such as λgt11 and
λEMBL3, and E. coli plasmids such as pBR322 and Bluescript (Stratagene); or eukaryotic
vectors, such as those in the pCMV family. Vectors incorporating an isolated human
cDNA shown in Sequence ID No. 1 (ATCC Deposit No. ) and the chromosomal
DNA as shown in Sequence ID Nos. 3 through 29 (ATCC Deposit No. ) for type 5
17β-HSD have been placed on deposit at the American Type Culture Collection
(ATCC, Rockville, MD), in accordance with the terms of the Budapest Treaty, and
will be made available to the public upon issuance of a patent based on the present
patent application.

These vectors generally include appropriate replication and control sequences
which are compatible with the host system into which the vectors are transfected. A
promoter sequence is generally included. For prokaryotes, some representative
promoters include β-lactamase, lactose, and tryptophan. In mammalian cells,
commonly used promoters include, but are not limited to, adenovirus,
cytomegalovirus (CMV) and simian virus 40 (SV40). The vector can also optionally
include, as appropriate, an origin of replication, ribosome binding sites, RNA splice
sites, polyadenylation sites, transcriptional termination sequences and/or a selectable
marker. It is well understood that there are a variety of vector systems with various
characteristics which can be used in the practice of the invention. A map of the
pCMV vector, which is an example of a vector which can be used in the practice of
the invention, is provided in Figure 2.

Host systems which are commonly known for expressing an enzyme,
and which may be transfected with an appropriate vector which includes a gene for
Type 5 17β-HSD, can be used in the practice of the invention. These host systems
include prokaryotic hosts, such as E. coli, bacilli, such as Bacillus subtilis, and other
enterobacteria, such as Salmonella, Serratia, and Pseudomonas species. Eukaryotic
microbes, including yeast cultures, can also be used. The most common of these is
Saccharomyces cerevisiae, although other species are commercially available and can
be used. Furthermore, cell cultures can be grown which are derived from mammalian
cells. Some examples of suitable host cell lines include embryonal kidney (293),
SW-13, Chinese hamster ovary (CHO), HeLa, myeloma, Jurkat, COS-1, BHK, WI38 and
Madin-Darby canine kidney (MDCK). In the practice of the invention, the 293 cells
are preferred.

Type 5 17β-HSD, whether recombinantly produced as described herein,
purified from nature, or otherwise produced, can be used in assays to identify
compounds which inhibit or alter the activity of the enzyme. In particular, since type
5 17β-HSD is shown to catalyze the conversion of progesterone to 20α-OH-P and the
conversion of Δ4-dione to testosterone, this enzyme can be used to identify compounds
which interfere with the production of these sex steroids. It is preferred that the
enzyme be obtained directly from the recombinant host, wherein following expression,
a crude homogenate is prepared which includes the enzyme. A substrate of the
enzyme, such as progesterone or Δ4-dione, and a compound to be tested are then mixed
with the homogenate. The activity of the enzyme with and without the test compound
is compared. Numerous methods are known which can be used to indicate the effects
of the test compound on the activity of the substrate for easy detection of the relative
amounts of substrate and product over time. For example, it is possible to label the
substrate so that the label also stays on any product that is formed. Radioactive labels,
such as 14C or 3H, which can be quantitatively analyzed, are particularly useful.
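
The comparison of enzyme activity with and without a test compound can be sketched as a short calculation. The example below assumes that the label is recovered quantitatively in the separated substrate and product fractions and that activity is expressed as the fraction of label found in the product; the counts, function names and this particular readout are hypothetical and are given only to illustrate the arithmetic.

```python
def fraction_converted(substrate_counts: float, product_counts: float) -> float:
    """Fraction of recovered label that is found in the product."""
    total = substrate_counts + product_counts
    if total <= 0:
        raise ValueError("no label recovered")
    return product_counts / total


def percent_inhibition(control_fraction: float, treated_fraction: float) -> float:
    """Reduction in conversion caused by the test compound, relative to the control."""
    if control_fraction <= 0:
        raise ValueError("control shows no conversion")
    return 100.0 * (1.0 - treated_fraction / control_fraction)


# Hypothetical counts (cpm) after incubating labeled delta4-dione with the homogenate.
control = fraction_converted(substrate_counts=6000, product_counts=4000)
treated = fraction_converted(substrate_counts=9000, product_counts=1000)

inhibition = percent_inhibition(control, treated)
print(f"control conversion {control:.0%}, with test compound {treated:.0%}, "
      f"inhibition about {inhibition:.0f}%")
```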

It is preferred that the mixture of the enzyme, test compound and substrate be
allowed to incubate for a predetermined amount of time. In addition, it is preferred
that the product is separated from the substrate for easier analysis. A number of
separation techniques are known, for example, thin layer chromatography (TLC), high
pressure liquid chromatography (HPLC), spectrophotometry, gas chromatography,
mass spectrometry and nuclear magnetic resonance (NMR). However, any
known method which can differentiate between a substrate and a product can be used.

It is also contemplated that the gene for type 5 17β-HSD or a portion thereof
can be used to produce antisense nucleic acid sequences for inhibiting expression of
Type 5 17β-HSD in vivo. Thus activity of the enzyme and levels of its products (e.g.
testosterone) may be reduced where desirable. In general, antisense nucleic acid
sequences can interfere with transcription, splicing or translation processes. Antisense
sequences can prevent transcription by forming a triple helix or hybridizing to an
opened loop which is created by RNA polymerase or hybridizing to nascent RNA.
On the other hand, splicing can advantageously be interfered with if the antisense
sequences bind at the intersection of an exon and an intron. Finally, translation can be
affected by blocking the binding of initiation factors or by preventing the assembly of
ribosomal subunits at the start codon or by blocking the ribosome from the coding
portion of the mRNA, preferably by using RNA that is antisense to the message. For
further general information, see Hélène et al., Biochimica et Biophysica Acta,
1049:99-125 (1990), which is herein incorporated by reference in its entirety.

An antisense nucleic acid sequence is an RNA or single stranded DNA
sequence which is complementary to the target portion of the target gene. These
antisense sequences are introduced into cells where the complementary strand base
pairs with the target portion of the target gene, thereby blocking the transcription,
splicing or translation of the gene and eliminating or reducing the production of type 5
17β-HSD. The length of the antisense nucleic acid sequence need be no more than is
sufficient to interfere with the transcription, splicing or translation of functional type 5
17β-HSD. Antisense strands can range in size from 10 nucleotides to the complete
gene; however, about 10 to 50 nucleotides are preferred, and 15 to 25 nucleotides are
most preferred.

Although it is contemplated that any portion of the gene could be used to
produce antisense sequences, it is preferred that the antisense is directed to the coding
portion of the gene or to the sequence around the translation initiation site of the
mRNA or to a portion of the promoter. Some examples of specific antisense
oligonucleotide sequences in the coding region which can be used to block type 5
17β-HSD synthesis include: TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30);
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31); GATGAAAAGTGGACCA
(SEQ ID No. 32); ATCTGTTGGTGAAAGTTC (SEQ ID No. 33);
TCCAGCTGCCTGCGGT (SEQ ID No. 34); CTTGTACTTGAGTCCTG (SEQ ID
No. 35); CTCCGGTTGAAATACGGA (SEQ ID No. 36);
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37);
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38); ATCTGAATATGGATAAT
(SEQ ID No. 39). Examples of antisense oligonucleotide sequences which can block
the splicing of the type 5 17β-HSD premessage are as follows:
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40);
GACACAGTACCTTTGAAGTG (SEQ ID No. 41);
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42);
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43);
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44);
GACATTCTACCTGCAGTTGA (SEQ ID No. 45); CTCAAAAACCTATCAGAAA
(SEQ ID No. 46); GGAAACTTACCTATCACTGT (SEQ ID No. 47);
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48). Examples of antisense
oligonucleotide sequences which inhibit the promoter activity of type 5 17β-HSD
include: GAGAAATATTCATTCTG (SEQ ID No. 49);
CGAGTCCTGATAAAGCTG (SEQ ID No. 50); GATGAGGGTGCAAATAA (SEQ
ID No. 51); GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52);
CAGAGATTACAAAAACAAT (SEQ ID No. 53);
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54); ACACATAATTTAAAGGA
(SEQ ID No. 55); TTAAATTATTCAAAAGG (SEQ ID No. 56);
AAGAGAAATATTCATTTCTG (SEQ ID No. 57);
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58); CTGCCGTGATAATGCCCC
(SEQ ID No. 59).
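
The relationship between an antisense oligonucleotide and its target can be illustrated with a minimal sketch. It takes SEQ ID No. 30 as listed above, computes its reverse complement, and locates that site in a short stretch of the sense strand of SEQ ID No. 1 (nucleotides 20 through 43, just downstream of the initiation codon). The helper name and the choice of fragment are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sense-strand fragment of SEQ ID No. 1, nucleotides 20 to 43 (codons 4 to 11).
sense_fragment = "".join("AAA CAG CAG TGT GTA AAG CTA AAT".split())

# Antisense oligonucleotide SEQ ID No. 30, as listed in the text.
seq_id_30 = "TTTAGCTTTACACACTGCTGTT"

target_site = reverse_complement(seq_id_30)
offset = sense_fragment.find(target_site)

print("target site on sense strand:", target_site)
print("found in fragment at offset:", offset)  # -1 would mean no match
```

The same check can be applied to the other coding-region oligonucleotides by substituting the appropriate stretch of SEQ ID No. 1.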

As is well understood in the art, the oligonucleotide sequences can be modified
in various manners in order to increase the effectiveness of the treatment with
oligonucleotides. In particular, the sequences can be modified to include additional
RNA at the 3' end of the RNA which can form a hairpin-loop structure and thereby
prevent degradation by nucleases. In addition, the chemical linkages in the backbone
of the oligonucleotides can be modified to also prevent cleavage by nucleases.

There are numerous methods which are known in the art for introducing the
antisense strands into cells. One strategy is to incorporate the gene which encodes
type 5 17β-HSD in the opposite orientation in a vector so that the RNA which is
transcribed from the plasmid is complementary to the mRNA transcribed from the
cellular gene. A strong promoter, such as pCMV, is generally included in the vector,
upstream of the gene sequence, so that a large amount of the antisense RNA is
produced and is available for binding sense mRNA. The vectors are then transfected
into cells which are then administered. It is also possible to produce single stranded
DNA oligonucleotides or antisense RNA and incorporate these into cells or liposomes
which are then administered. The use of liposomes, such as those described in
WO95/03788, which is herein incorporated by reference, is preferred. However,
other methods which are well understood in the art can also be used to introduce the
antisense strands into cells and to administer these to patients in need of such
treatment.
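
The complementarity that antisense inhibition relies on can be checked with a few lines of code. The following is a minimal sketch, not part of the described method: the function names are hypothetical, the sense fragment is the start of the coding region as it appears in the exon 1 listing (SEQ ID NO: 4), and the oligonucleotide is SEQ ID NO: 30 from the sequence listing.

```python
# Illustrative sketch only: helper names are hypothetical, sequences are
# copied from the sequence listing of this document.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_target(oligo, sense):
    """Return the 0-based index in `sense` where `oligo` would anneal, or -1."""
    return sense.upper().find(reverse_complement(oligo))


# Start of the coding region, taken from the exon 1 listing (SEQ ID NO: 4).
EXON1_FRAGMENT = "ATGGATTCCAAACAGCAGTGTGTAAAGCTAAATGATGGCCACTTC"
# Antisense oligonucleotide of SEQ ID NO: 30.
SEQ_ID_30 = "TTTAGCTTTACACACTGCTGTT"

if __name__ == "__main__":
    print("target (sense) sequence:", reverse_complement(SEQ_ID_30))
    print("offset in exon 1 fragment:", antisense_target(SEQ_ID_30, EXON1_FRAGMENT))
```

A result of -1 from `antisense_target` would mean that the oligonucleotide is not perfectly complementary to the fragment supplied; the sketch does not model mismatches or secondary structure.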

The following is an example of the expression of human type 5 17β-HSD.
This example is intended to be illustrative of the invention and it is well understood by
those of skill in the art that modifications, alterations and different techniques can be
used within the scope of the invention.

Expression of
20α, 17β-HSD (Type 5 17β-HSD)

Construction of the expression vector and nucleotide sequence determination

The phage DNA was digested with EcoRI restriction enzyme and the resulting
cDNA fragments were inserted in the EcoRI site downstream of the cytomegalovirus
(CMV) promoter of the pCMV vector as shown in Figure 2. The recombinant pCMV
plasmids were amplified in Escherichia coli DH5α competent cells, and were isolated
using the alkaline lysis procedure as described by Maniatis in Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Press 1982). The sequencing of double-stranded
plasmid DNA was performed according to the dideoxy chain termination
method described by Sanger F. et al., Proc. Natl. Acad. Sci., 74:5463-5467 (1977)
using a T7 DNA polymerase sequencing kit (Pharmacia LKB Biotechnology). In
order to avoid errors, all sequences were determined by sequencing both strands of the
DNA. The oligonucleotide primers were synthesized using a 394 DNA/RNA
synthesizer (Applied Biosystems).

As shown in Figure 2, the pCMV vector contains 582 nucleotides of the
pCMV promoter, followed by 74 nucleotides of unknown origin which includes the
EcoRI and HindIII sites, followed by 432 basepairs (bp) of a small t intron (fragment
4713 - 4570) and a polyadenylation signal (fragment 2825 - 2536) of SV40, followed
by 156 nucleotides of unknown origin, followed by 1989 bp of the PvuII (628) to
AatII (2617) fragment from the pUC 19 vector (New England Biolabs) which contains
an E. coli origin of replication and an ampicillin resistance gene for propagation in E.
coli.
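
The segment lengths quoted above can be cross-checked arithmetically. The sketch below is only a bookkeeping aid under stated assumptions: the segment labels are descriptive names chosen here, fragment lengths are taken as simple coordinate differences (which is what reproduces the 432 bp and 1989 bp figures in the text), and the total excludes any cDNA insert.

```python
# Illustrative bookkeeping only; lengths are those stated in the text and the
# labels are descriptive names chosen for this sketch.
PCMV_SEGMENTS = [
    ("CMV promoter", 582),
    ("linker of unknown origin (EcoRI and HindIII sites)", 74),
    ("SV40 small t intron and polyadenylation signal", 432),
    ("spacer of unknown origin", 156),
    ("pUC 19 PvuII-AatII fragment", 1989),
]


def total_length(segments):
    """Sum the lengths of all (label, length) segments."""
    return sum(length for _, length in segments)


def span(start, end):
    """Length of a fragment taken as the difference between two coordinates."""
    return abs(end - start)


if __name__ == "__main__":
    # SV40 pieces: small t intron (4713 - 4570) plus poly(A) signal (2825 - 2536).
    print("SV40 portion:", span(4713, 4570) + span(2825, 2536))
    # pUC 19 piece: PvuII (628) to AatII (2617).
    print("pUC 19 portion:", span(628, 2617))
    print("empty vector total:", total_length(PCMV_SEGMENTS))
```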

Transient expression in transformed embryonal kidney (293) cells

The vectors were transfected using the calcium phosphate procedure described
by Kingston, R.E., In: Current Protocols in Molecular Biology, Ausubel et al., eds.,
pp. 9.1.1 - 9.1.9, John Wiley & Sons, N.Y. (1987), using 1 to 10 ng of
recombinant plasmid DNA per 10^ cells. The total amount of DNA is kept at 10 μg of
plasmid DNA per 10* cells by completing with pCMV plasmid without insert. The
cells were initially plated at 10* cells/cm² in Falcon® culture flasks and grown in
Dulbecco's modified Eagle's medium containing 10% (vol/vol) fetal bovine serum
(Hyclone, Logan, UT) and supplemented with 2 mM L-glutamine, 1 mM sodium
pyruvate, 100 IU penicillin/ml, and 100 μg streptomycin sulfate/ml, under a
humidified atmosphere of air/CO2 (95%/5%) at 37°C.

Assay of enzymatic activity

The determination of enzymatic activity was performed as described by Luu-The
et al., Biochemistry, 13:8861-8865 (1991) which is herein incorporated by
reference. See also Lachance et al., J. Biol. Chem., 265:20469-20475 (1990).
Briefly, 0.1 nM of the indicated 14C-labeled substrate (Dupont Inc. (Canada)), namely,
dehydroepiandrosterone (DHEA), 4-androstene-3,17-dione (Δ4-dione), testosterone
(T), estrone (E1), estradiol (E2), dihydrotestosterone (DHT), and progesterone
(PROG), was added to freshly changed culture medium in a 6-well culture plate.
After incubation for 1 hour, the steroids were extracted twice with 2 ml of ether. The
organic phase was pooled and evaporated to dryness. The steroids were solubilized in
50 μl of dichloromethane, applied to a Silica gel 60 thin layer chromatography (TLC)
plate (Merck, Darmstadt, Germany) and then separated by migration in the toluene-
acetone (4:1) solvent system (Luu-The, V. et al., J. Invest. Dermatol., 102:221-226
(1994) which is herein incorporated by reference). The substrates and metabolites
were identified by comparison with reference steroids, revealed by autoradiography
and quantitated using the Phosphoimager System (Molecular Dynamics, Sunnyvale,
CA).

Cloning of the type 5 17β-HSD genomic DNA clone

The hybridization and sequencing methods were as described above and as
previously described (Luu-The et al., Mol. Endocrinol., 4:268-275 (1990); Luu-The
et al., DNA and Cell Biol., 14:511-518 (1995); Lachance et al., J. Biol. Chem.,
265:20469-20475 (1990); Lachance et al., DNA and Cell Biol., 10:701-711 (1991);
Bernier et al., J. Biol. Chem., 269:28200-28205 (1994) which are herein
incorporated by reference).

About 20 recombinant clones which gave the strongest hybridization signal
were selected for second and third screening in order to isolate a single phage plaque.
The two longest clones that hybridized with specific oligonucleotide probes located
at the 5' and 3' regions of type 5 17β-HSD, respectively, were selected for mapping,
subcloning and sequencing. As shown in Figures 5 and 6, the gene is included in
approximately 16 kilobase pairs of introns and contains 9 short exons. A primer
extension analysis using oligoprimer CAT-CAT-TTA-GCT-TTA-CAT-ACT-GCT-G
located at positions 13 to 27, indicates that the start site is situated 37 nucleotides
upstream from the ATG initiating codon.

The sites and signatures in the primary protein sequence were detected using
PC/Gene software (IntelliGenetics Inc., Mountain View, CA). This analysis revealed
a potential N-glycosylation site at Asn-198; five protein kinase C sites at Ser-73, Thr-82,
Ser-102, Ser-121, and Ser-221; five casein kinase II phosphorylation sites at Ser-129,
Thr-146, Ser-221, Ser-271, and Thr-289; two N-myristoylation sites at Gly-158
and Gly-298; a tyrosine sulfatation site at Tyr-55; an aldo/keto reductase family
signature 1 (25) at amino acids 158 to 168 and an aldo/keto reductase family putative
active site signature at amino acids 262 to 280.
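
A motif scan of this kind can be illustrated for the N-glycosylation site. The sketch below is an assumption-laden example and not the PC/Gene implementation: it uses the commonly cited consensus pattern N-{P}-[ST]-{P}, a hypothetical function name, and the residues 193 to 208 of SEQ ID NO: 2 written in one-letter code.

```python
import re

# Commonly cited N-glycosylation consensus: Asn, any residue except Pro,
# Ser or Thr, any residue except Pro. A lookahead keeps overlapping hits.
SEQUON = re.compile(r"N(?=[^P][ST][^P])")


def find_n_glyc_sites(protein, first_residue=1):
    """Return the residue numbers of Asn residues that start a sequon.

    `first_residue` is the number of the first residue in `protein`, so that
    a fragment can be scanned while keeping full-length numbering.
    """
    return [m.start() + first_residue for m in SEQUON.finditer(protein.upper())]


# Residues 193-208 of SEQ ID NO: 2 (Cys His Pro Tyr Phe Asn Arg Ser Lys Leu
# Leu Asp Phe Cys Lys Ser) in one-letter code.
FRAGMENT_193_208 = "CHPYFNRSKLLDFCKS"

if __name__ == "__main__":
    print(find_n_glyc_sites(FRAGMENT_193_208, first_residue=193))
```

On this fragment the scan reports the asparagine at position 198, in agreement with the site named in the text. A pattern match only indicates a potential site and says nothing about whether the residue is glycosylated in cells.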

As described above, the enzymatic activity of the type 5 17β-HSD was
evaluated by transfecting 293 cells with vectors which included the gene encoding
human type 5 17β-HSD. The ability of the enzyme to catalyze the transformation of
progesterone (P) to 20α-hydroxyprogesterone (20α-OH-P), 4-androstenedione
(Δ4-dione) to testosterone (T), 5α-androstane-3,17-dione (A-dione) to dihydrotestosterone
(DHT), dehydroepiandrosterone (DHEA) to 5-androstene-3β,17β-diol, and estrone
(E1) to estradiol (E2) was analyzed. As shown in Figure 1A, the enzyme possesses
high reductive 20α-HSD activity, wherein progesterone (P) is transformed to the
inactive 20α-OH-P, and 17β-HSD activity, wherein Δ4-dione is converted to
testosterone (T). However, 3α-HSD activity which is responsible for the
transformation of DHT to 5α-androstane-3α,17β-diol is negligible. The ability of this
enzyme to transform E1 and E2 was also negligible (Figure 1A). Figure 1B shows
that the 20α-HSD and 17β-HSD activities increased over time.

The isolated amino acid sequence of human type 5 17β-HSD was also
compared with rabbit 20α-HSD (rb), rat 20α-HSD (r), human 3α-HSD (h), rat 3α-HSD
(r), bovine prostaglandin F synthase (b pgfs), frog ρ-crystallin (f ρ-crys) and
human type 1 and type 2 17β-HSDs (h) as shown in Figure 4. These sequences show
76.2%, 70.7%, 84.0%, 68.7%, 78.3%, 59.7%, 15.2% and 15.0% identity with type
5 17β-HSD, respectively.
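
For readers who wish to reproduce figures of this kind, a percent identity can be computed from any pairwise alignment. The text does not state which alignment program or denominator was used, so the sketch below makes an explicit assumption: identity is counted over aligned positions where neither sequence has a gap. The function name and the toy strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption of this sketch: positions where either sequence carries a gap
    are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy example: the first string is residues 1-6 of SEQ ID NO: 2 in
    # one-letter code, the second is a made-up variant with one substitution.
    print(round(percent_identity("MDSKQQ", "MDSKHQ"), 1))
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give different values, so percentages from different sources are comparable only when the convention is the same.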

Although the present invention has been described in relation to particular
embodiments thereof, many other variations and modifications and other uses will be
apparent to those skilled in the art.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: LUU-THE, Van
LABRIE, Fernand

(iii) NUMBER OF SEQUENCES: 59

(iv) CORRESPONDENCE ADDRESS:
(D) STATE: NY
(E) COUNTRY: US
(F) ZIP: 10036-8403

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Meilman, Edward
(B) REGISTRATION NUMBER: 24,735
(C) REFERENCE/DOCKET NUMBER: P/1259-313

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (212) 382-0700
(B) TELEFAX: (212) 382-0888
(C) TELEX: 236925



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1206 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 11..982

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GTGACAGGGA ATC GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC 
Met Asp Ser Lya Gin Gin Cys Val Lvs Leu Asn Asp Gly 
1 5 10 



"Zl Irl rl" f^'^ I" ™ TAC AAT AAT GAG GAG CAG GTT 

Phe Arg His He Asp Ser Ala His Leu Tyr Asn Asn Clu Glu Gin Val 

55 60 



49 



"Zi mI^ S^A TTG GSA TTT GGC ACC.TAT GCA CCT CCA GAG GTT 97 
10 " ^'^ ^" <^ly Thr Tyr Ala Pro Pro Glu Val 

" 13 20 25 

CCG AGA ACT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG 
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ue 5?!; aII 1^ 
15 ^° " 40 45 



145 



193 



241 



^ S« f ?f tT*^ CGA AGC AAG ATT GCA gat GGC AGT GTG AAG AGA GAA 
Gly Leu Ala lie Azg Ser Lys lie Ala Asp Gly Ser Val Lys Arg Glu 
" 70 75 

25 T?- SI^ I" ^ '■CC ACT TTT CAT CGA CCA GAG 289 

^ Asp He Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu 
8° 85 90 

TTG GTC CGA CCA SCO TTG GAA AAC TCA CTG AAA AAA GCT CAA TTG GAC 337 
30 ^« ^~ ^" S« l-*" Lys Ala £4° X^p " 

TAT GTT GAC CTC TAT CTT ATT CAT TCT CCA ATS TCT CTA ue era isf-r <9ac 
Tyr Val Asp Leu Tyr Leu He His Ser ?S M« I" Su f ?S «J 
35 120 125 

GAG GAA CTT TCA CCA ACA GAT GAA AAT GGA AAA GTA ATA TTT GAC ATA 
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly ^ Val He ^ He 

135 140 

40 GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA 
Val Asp Leu Cys Thr Thr Trp Glu Ala Net Glu Lys Cys Lys Sp m!, 

45 ^ nit 2?*^ ^ l^^ ?r '^^'^ TTC AAC CGC AGG CAG CTG 529 

Gly Leu Ala Lys Ser lie Gly Val Ser Asn Phe Asn Arg Arg Gin Leu 

165 170 



433 



481 



577 



625 



^ (S? tT^ CCA GGA CTC AAG TAC AAG CCT GTC TGC AAC 

50 1^1 '^'^ "-y^ fS2 ^" Tyr Lys Pro Val Cys Asn 

i'3 180 185 

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT ttc 
Gin val Glu cys His Pro Tyr Phe Asn Arg Ser "u SIS Se 
55 "5 200 *^ 205 

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT 6T» 
Cys Lys Se: Lys Asp lie val Leu Val Ala Tyr Xll SS GlJ 5« " 

210 215 220 

^ ^ 2^- 1°*^ TCC CCG GTG CTC TTG GAG 721 

Gin Arg As? Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu 
225 230 235 

65 ^ S"? ?" l'^ *^ CAC AAG CGA ACC CCA GCC -"Sg 

03 Asp Pro Va. Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala 

245 250

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly His Phe Met
1 5 10 15

Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val Pro Arg Ser
20 25 30

Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly Phe Arg His
35 40 45

Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala
50 55 60

Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu Asp Ile Phe
65 70 75 80

Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu Leu Val Arg
85 90 95

Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp Tyr Val Asp
100 105 110

Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly Glu Glu Leu
115 120 125

Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile Val Asp Leu
130 135 140



817 




CTG ATT GCC CTG CGC TAC CAG CTG CAG CGT GGG GTT GTG GTC CTG crr 
Leu He Ala Leu Arg Tyr Gin Leu Gin Arg gI? vJi vl? Val Ala 

260 265 

S5 ??? isj §K 1^ S5 is; SI? s; s SI 0^ 

S ?S S2 ^ £5 iS? SI S5 s ^ S 

AAT CTC CAC TAT TTT AAC AGT GAT AGT TTT GCT AGC CAC CCT AAT Tar 
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe AU S^r "s Pro lyl 

310 315 ' 

S lyl ITr ^ lil ''^ TTGCCTGATG ATGTCTACCA 1012 

320 

GAAGGCCCTG TGTGTGGATG GTGACGCAGA GGACGTCTCT ATGCCGGTGA CTGGACATAT 1072 
CACCTCTACT TAAATCCGTC CTGTTTAGCG ACTTCAGTCA ACTACAGCTC ACTCCATAGG 
25 CCAGAAATAC AATAAATCCT GTTTAGCGAC TTCAGTCAAC TACAGCTCAC TCCATAGGCC 
AGAAATACAA TAAA 



961 



1132 
1192 
1206 



Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala Gly Leu Ala
145 150 155 160

Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu Glu Met Ile
165 170 175

Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
180 185 190

Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe Cys Lys Ser
195 200 205

Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser Gln Arg Asp
210 215 220

Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val
225 230 235 240

Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala Leu Ile Ala
245 250 255

Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
260 265 270

Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu Phe Gln Leu
275 280 285

Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg Asn Leu His
290 295 300

Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr Pro Tyr Ser
305 310 315 320

Asp Glu Tyr



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 730 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AAGAACAAAT ACTATTAAGG CACTGCTTGC ATATATTAAA TGATGTCCAA ACTCCAAAAA 60
CTGTTAATAA TTAACACTCC AATAAAAACT ACACCAGAAT TTCTTTTTAT TTGCACCCTC 120
ATCAGGATTA CAGCTTTATC AGGACTGCAT CTTCTTCAGA AATGAATATT TCTCTTACAA 180
CGCAAAGAAA GAAAAATCAA AATAAATTTT CTGATTGAAA ATGTAAAAAG GCAAATATTT 240
TTACACTTTT AACTTTAATT TTTTATTGAG GACCAACTGT TTGAAAAATT CTCATTAGTC 300
ATTCCTTTAA ATTATGTGTA TGTGAGACAA AGACGTAAGA TGGTTAATTA TTTCAAATGA 360
TGCACTATAA AGAAGGGGCA TTATCACGGC AGAAACGAAA AAAGATATTT GTAGCTGGAG 420
GTTTTTATAG TCTAACATAT GC?rTGCTATT TGTTCTACAA ATCCTTTTGA ATAATTTAAT 480
ATAGAGATTT CGAATAGAAA ATAATACTTT AGATAGAAAT TAATGAGTTT ATTATAACCA 540
TATATTATAA TAATTTACTT AGGAATTCTC TTTGATAAGA AACAAATGAA CTGAATGCAA 600
TTTTCTCCAC AGACCATATA AGACTGCCTA TGTACCTCCT CCTACATGCC ATTGGTTAAC 660
CATCAGTCAG TTTGCAGGGG TGGGGGGAGG GGTTTCCTGC CCATTGTTTT TGTAATCTCT 720
GAGGAGAAGC 730

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 121 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 38..121

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AGCAGCAAAC ATTTGCTAGT CAGACAAGTG ACAGGGAATG GATTCCAAAC AGCAGTGTGT 60
AAAGCTAAAT GATGGCCACT TCATGCCTGT ATTGGGATTT GGCACCTATG CACCTCCAGA 120
G 121

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 252 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTAAGAATAA TTCCTTTTAG TTTTCGGATT TCAAAAGAAT AAACCTAGTA GAAGTGAAAC 60
CCGTATTGGG TTGTAAGGTT CGTGTTCCTA CCTTACTCTG GATGACTCAC TGGTCTAGGT 120
TTCCTACGCT AGGAGAAAAA AGTAGGCAAT CCTTGTTCTG CATTGAGGTC CATTCCTATG 180
GTCACGTACT GCTTATTTTT CGTTTGTGCA CTGTTTCTTT CTTCTGTTCA TGTCTAGTTC 240
CCAGCTTGGC AG 252

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGAAGTCTGA GTGAGCATTC TGTGTAATAT CACTGGGAGA GAACTCATAT GAGCTTGCAC 60
CGTTTCCCTT CTATACTCCA TGTGATTTTT ACCATGTATA ATATCACTAT ATTAAAAATA 120
ATTAGGACTA TTTCAGTCAT GTTAACTTTT CCAACAAATC ACTGAATCTG AGGGTGTTAT 180
GTGGTACCTC CATAACAGTG ATCAACCAGA GATTGCCTGA GACTGAAGGT GTTTCTGGGA 240
TGCTCAACCT TTATTACTAA CCAGGAAAGA CTCAGGCAAA CTGAGATGGA CTTTTCACCC 300
CACATACAGA CAGGAGGAAA AGCTGATTCT TGTATAAAAG TCAATGCTTG TGCCTGAACT 360
ACCTCTCAGC CACAGTGATC ACCAGATACT ACCTTTGGTT GCTCCTCCAG 410

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 168 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..168

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTTCCGAGAA GTAAAGCTTT GGAGGTCACA AAATTAGCAA TAGAAGCTGG GTTCCGCCAT 60
ATAGATTCTG CTCATTTATA CAATAATGAG GAGCAGGTTG GACTGGCCAT CCGAAGCAAG 120
ATTGCAGATG GCAGTGTGAA GAGAGAAGAC ATATTCTACA CTTCAAAG 168

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 700 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTACTGTGTC TATGATGAGC TTGTGTGCAC ATGTATTTAT TGTGATTGTG TGGAGATGAC 60
AATTCTATGA CTGGATGAGT AGTTGTGGGT GAATTTTGCT TCTGGGTTCA AATTTATTCA 120
CACATACTCA CATACTAAAA CTGAAATCAA AATCAAGGAA TGATGATCAC TTTTCATTTT 180
GGCTGTGTTC CAATTTATGA CCTGAAAGTC CCTTTACTTT TTTGAGCTTC AGCCGAGATC 240
AGTGTGATTT GACATGTGCT ATAGAATCAC AGAGAACAAT AATCATGTTA TGGTTTTTCT 300
TATCGCCTGG GTGATTTTCT AAGATTTCTT ATTATTCTCT CAATTGCTAT CTTTATCAGT 360
GAGATAGAAA GCAATATAAG AAAGCTCTGG GAGTATTAAA TAATAGACAC TTAAATTGTC 420
CTAAATTGTG TCCAGCATAG TGAGCATGTT CAAAACTTGT TTTACCCCCC TTTTATGTTG 480
CTTTAGTTTC TAAGCAACAT AAATAGCTAT TCTTAAGCAT TGGGTTGAAT GGATAGAAGA 540
ATTAGACTGT TAAAATGAGT TGTAAACTCT ACTGAAGATA ATTCAGGTAA CATCATAGTT 600
ATTACTTAAT ACTAATCTTT ACATTTTAAG AATTTACTCC TATCATTCAG TAGATGTACA 660
AACTATACAT CCAACGTATA ATAAAGTTTA TAAGGATAGG 700
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACTAGATGGC ACAAAGTAAT AAGATTTGCT CAAGCATTCA TTCAAAATCA CCTCCATTCT 60
TTAACCTCTG CAG 73

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..117

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CTTTGGTCCA CTTTTCATCG ACCAGAGTTG GTCCGACCAG CCTTGGAAAA CTCACTGAAA 60
AAAGCTCAAT TGGACTATGT TGACCTCTAT CTTATTCATT CTCCAATGTC TCTAAAG 117
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTATGCAGTT TGTATGAGCA TAAAATTGCG CTTCTGCTGT CATTATAAAC ATTGTTTATC 60
TGGATAGTTG AACAGAGCTT TTTATTAGGA GGATGTAGGG ATTATCACAC AGAAGAAGAA 120
CCGTAAGTGG AACACCTAAT TTCCTTTCTT TC 152
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 208 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATATAATATT TGTAAGAGAT TAGAGGAAGC CTGTCTCCTG AATACATTCC TTATACCTTC 60
ATATGTAAAA CACTTAGCAC ATATCACTTT CTGGAGCATT GTACCACCTG TCTCATGGAG 120
GATTAGTGTC CTTAAAGGTA CCTGGGGTTA CAGCTATGAG TGGAGAAATT AATTTGTGAC 180
ATCATTAAAA TGACTGCTTC TATTTCAG 208

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 78 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..78

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCAGGTGAGG AACTTTCACC AACAGATGAA AATGGAAAAG TAATATTTGA CATAGTGGAT 60
CTCTGTACCA CCTGGGAG 78
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 98 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GTGAGTGCTT GGCGGAGACG ACACAGAGAA GGATGACAAA AAGAGAAAAT CTGTTTCCCA 60
GGTTCGATAG GAAAGAATGG AATATGCACC ATTAGATC 98

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 249 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GACAGGAATC TCTTTCCTTG CTTGTGCATT AATCTATGCA GTTTCCTAAG GAAGAGATAG 60
AAATTCTTAC TCTTGCTGCC TCTATCTTCT TCCCCTATTT GCTGTTTGAA TTTTTCTTTT 120
TTTGACAATC ACTGCTAGCT ATTTTCATTG TCATACTTTG AAAGTTGTTG CTCTCACAGT 180
TCTGTCTTGC ATTTACCGTG ATTTGCAGCC AACTGCACAA ATAATTCCTC ACAACCCCTT 240
TCTCCACAG 249

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..123

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCCATGGAGA AGTGTAAGGA TGCAGGATTG GCCAAGTCCA TTGGGGTGTC AAACTTCAAC 60
CGCAGGCAGC TGGAGATGAT CCTCAACAAG CCAGGACTCA AGTACAAGCC TGTCTGCAAC 120
CAG 123

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GTGAGCTCCC TTGGCCTTCT CTCCTTTCGG TTCTTCATGC CCCCTCTTCC TGTCCTATTG 60
CCAAATATCT GTTTGTTTTG TCCCAGTTAT CTTTGTGAAG TAGAAGATTA TCTAGAGAGC 120
AAAGCTTCTG TCAAGAAA 138
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

ATTTCCATTT ATACTTTTAG AAGATATATA AAATTTATTT CTATGAAAAA GGTTATTACT 60
TGACAATAAT ATCCTCAGCT CAAATATAAT GCTATACTGA TTATTATTCA GCTTCCTTAC 120
TTTCATCTTT TCAATATTAA CATAACTATT TCATATAAAT TGATGCTTCT CTCTTTTGGT 180
CAACTGCAG 189

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..110

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTAGAATGTC ATCCGTATTT CAACCGGAGT AAATTGCTAG ATTTCTGCAA GTCGAAAGAT 60
ATTGTTCTGG TTGCCTATAG TGCTCTGGGA TCTCAACGAG ACAAACGATG 110
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 136 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GTAATAAAAA CAATGGGACC TTTACATAAA CCTTCATTTT GCAGAAAATT TTTTAGTCAG 60
AGCATCCTCA GTTTCCTGTA GTTAAGTTTC AAGTGGCTCA TGGAGAGGAA AGAGAATTGC 120
GTTTCTGACG AGATCT 136

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTTAGGGAGC TGCCTAACAA ACTATCGGCA GCCTCAGGGC CTCAGCCTTT CTGCCTTTCC 60
TTCCAG 66

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 166 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..166

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GGTGGACCCG AACTCCCCGG TGCTCTTGGA GGACCCAGTC CTTTGTGCCT TGGCAAAAAA 60
GCACAAGCGA ACCCCAGCCC TGATTGCCCT GCGCTACCAG CTGCAGCGTG GGGTTGTGGT 120
CCTGGCCAAG AGCTACAATG ACCAGCGCAT CAGACAGAAC GTGCAG 166

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 136 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GTGAGGAGCG GGGCTGTGGG CCTCAGGTCT CCTGCACAGT GTCCTTCACA CGTGTCCTTC 60
TTGTAAGGCT CTCAGGACAG CCTTGGGCCA GCTCCATTTC CCTGTATTTC CCATATGAAT 120
GCTTTGCGTG CATCCT 136
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 286 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

rCCTATCATC TOGGCACAAT GTCAGCGCTG TTTCTTCTCC ATTTTCTGTT GAAATTTTCT 60
::TTTGTCTGC AGAGTTGCAC AGTTTCAATA CATAATATCT AGGAATGGAT TTCTGCTTAT 120
TTTTCGTGAG CTATTCATTG ACCCACCTGA GTGTTTAGAG CTGACTTCTA TAACTGTTTA 180
AAACTTACCA ATATTTTAAG TATTGTCTCT GCACCCTACT GTCTAATATA CTTGGGGATT 240
:ACAACTGGC AATCTAAAAA TAATAAAAGT TTTTTATTTC TGATAG 286

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

GTTTTTGAGT TCCAGTTGAC TGCAGAGGAC ATGAAAGCCA TAGATGGCCT AGACAGAAAT 60
CTCCACTATT TTAACAGTGA TAG 83
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 713 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTAAGTTTCC TTTGTAAATG GGTGATCTAA TTTATTTCTG GAGAAGGAAT GTAGGATGGG 60
TGTTGAGAGT GACCTCCATA CCAGAGGGAC AGAGGCCAAT GTGAGTCAGA GGTGAGACTG 120
GAACTCTCCT GCTGGATTCA CTCCAGAGCT CTGTTCTCTG GCAGGGTGAG TGGGCAGGGA 180
TCAGCATGGG TCAACCTGTG CCTCTGCTCT CCTGACTCCA TGGAACTTTC CAGAGCAGCC 240
AACATCATTG CCAAGTCTGC ACGTTCCATA TAGGCCTGGT GTTTCTACCA CTGGACATGC 300
TGTGGATACT GCCCATGTGA CTTCATTAGA TGTTTCCAAA TCTGTGCTTA TATCACATTG 360
TCCCAAACCT GCTCAGCTCC TTATCAAATC AAAAACATTT CCATCAACTT TGTGGTCCAG 420
GTGCCAATTC CCACCTCCTT CATATGGAAT TGCTTGCTAG ATCCTGTCAA TTCAGCATCT 480
TTTATTATTT CAAATGTTTT TCCTCCTTCT CCTTGCACGT TTGTTCATGC CCCAAACTCT 540
GCTTTTGCCT CCAGAAAGCC TTCCTTAGTG GAGTGAATAG GAGTGCTTGT CCTTGATTTC 600
CTGCAATATG GAGCTCTCAA GGCAGAGAAT TTAAAAAAAT TTAAAATCAA GGAGTGTGAG 660
TGTGGAGGCA GAAGCTCCAT TGTTGTATAT AATTTCTAGC TGATAAAAGA TCTT 713

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTTAATGCAC TGTAGCTCCT TGGATATTAG ACCCTATATC ATATATAACA ATTTACATTT 60
CTGAATCTTA CAAAATATAT TGCATACAGT AGGCAGTAGC AGGTAATAAG TAAAGTAACA 120
AAAGAAAGTA TAATCAGAGT ATCTCTGCTC TGCTGACAGA TGTACAGGAA TATACTTGAA 180
TATTTGACTT TGTGTGTTTT ACGTGTTAAC TTCCAGATAA GGGAATATGA TTGAATAATT 240
TATTATTTTG AAAATACTGT ATTATGAAGC CATGTTCATA AAGGTAAGAA AGGCAGATTC 300
TACAACTAGT CAGACAACTT AACATTCATA CTAATGACAG CTTCATTGAA ATCACTTTAC 360
TACrCCCCTA GTAATGGAGT CATTGCATTT ATATTATACA TTATTCTCTT TTCAG 415
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..230

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

TTTTGCTAGC CACCCTAATT ATCCATATTC AGATGAATAT TAACATGGAG GGCTTTGCCT 60
GATGATGTCT ACCAGAAGGC CCTGTGTGTG GATGGTGACG CAGAGGACGT CTCTATGCCG 120
GTGACTGGAC ATATCACCTC TACTTAAATC CGTCCTGTTT AGCGACTTCA GTCAACTACA 180
GCTGAGTCCA TAGGCCAGAA AGACAATAAA TTTTTATCAT TTTGAAATAA 230
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 109 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TTGAATGTTT TCTCAAAGAT TCTTTACCTA CTCTGTTCTG TAGTGTGTGT TTTCTTCTGG 60
CTCAGAAGTG TGTGTGTGTG TGTGTGTGCT TTCTTCTGGC TCAACAGGG 109
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TTTAGCTTTA CACACTGCTG TT 22

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

TCCAAAGCTT TACTTCTCGG 20

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

GATGAAAAGT GGACCA 16
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

ATCTGTTGGT GAAAGTTC 18

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TCCAGCTGCC TGCGGT 16

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CTTGTACTTG AGTCCTG 17

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CTCCGGTTGA AATACGGA 18

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CATCGTTTGT CTCGTTGAGA 20

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCACTGTTAA AATAGTGGAG AT 22

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

ATCTGAATAT GGATAAT 17

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TTCTCGGAAC CTGGAGGAGC 20

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GACACAGTAC CTTTGAAGTG 20

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGGACCAAAG CTGCAGAGGT 20

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CCTCACCTGG CTGAAATAGA 20

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

AAGCACTCAC CTCCCAGGTG 20

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GACATTCTAC CTGCAGTTGA 20

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CTCAAAAACC TATCAGAAA 19

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GGAAACTTAC CTATCACTGT 20

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GCTAGCAAAA CTGAAAAGAG 20

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GAGAAATATT CATTCTG 17

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CGAGTCCTGA TAAAGCTG 18

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GATGAGGGTG CAAATAA 17

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GGAGTGTTAA TTAATAACAG TTT 23

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

CAGAGATTAC AAAAACAAT 19

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

TGCCTTTTTA CATTTTCAAT CA 22

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
WO 97/11162 PCT/CA96/00605

(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ACACATAATT TAAAGGA 17

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTAAATTATT CAAAAGG 17

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

AAGAGAAATA TTCATTTCTG 20

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CCCCTCCCCC CACCCCTGCA 20

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CTGCCGTGAT AATGCCCC 18

CLAIMS
We claim:

1. An isolated nucleotide sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 1 or a
complement thereof, to hybridize under stringent conditions to the coding region of
SEQ ID No. 1 or a complement thereof and said sequence encoding an enzyme which
catalyzes the conversion of progesterone to 20α-hydroxyprogesterone and the
conversion of 4-androstenedione to testosterone.

2. The nucleotide sequence, as recited in claim 1, wherein said sequence is the
coding region of SEQ ID No. 1.

3. A recombinant expression vector comprising a promoter sequence and a
nucleotide sequence in accordance with claim 1.

4. A recombinant expression vector comprising a promoter sequence and a 
nucleotide sequence in accordance with claim 2. 

5. A recombinant host cell, transformed or transfected with the vector of claim 4.

6. The recombinant host cell of claim 5, wherein said host cell is a eukaryotic
cell.

7. A recombinant host cell, transformed or transfected with the vector of claim 3.

8. The recombinant host cell of claim 7, wherein said host cell is a eukaryotic
cell.

9. The recombinant host cell of claim 8, wherein a nucleotide sequence that
hybridizes under stringent conditions with SEQ ID No. 1 or its complement is
integrated into the genome of said host cell.

10. The recombinant host cell of claim 9, wherein said nucleotide sequence is
located on a recombinant vector.

11. The recombinant host cell, as recited in claim 8, wherein said host cell is
capable of expressing a biologically active type 5 17β-hydroxysteroid dehydrogenase.

12. An isolated nucleotide sequence comprising at least ten consecutive nucleotides
identical to 10 consecutive nucleotides in the coding region of SEQ ID No. 1, or the
complement thereof.

13. The nucleotide sequence, as recited in claim 12, wherein said sequence
comprises at least fifteen consecutive nucleotides identical to 15 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

14. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least twenty consecutive nucleotides identical to 20 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

15. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least thirty consecutive nucleotides identical to 30 consecutive nucleotides
in the coding region of SEQ ID No. 1, or the complement thereof.
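
By way of illustration only, the "consecutive nucleotides identical" criterion of claims 12 to 15 can be checked mechanically. The following Python sketch is not part of the disclosure: the function names `reverse_complement` and `shares_run` are hypothetical, and the `reference` string is an arbitrary stand-in used for demonstration, not the coding region of SEQ ID No. 1. It slides a window of k nucleotides along a candidate sequence and tests each window against the reference and its reverse complement.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def shares_run(candidate: str, reference: str, k: int) -> bool:
    """True if `candidate` contains at least k consecutive nucleotides that are
    identical to k consecutive nucleotides of `reference` or of its complement
    (read as the reverse complement)."""
    candidate = candidate.upper()
    targets = (reference.upper(), reverse_complement(reference))
    for start in range(len(candidate) - k + 1):
        window = candidate[start:start + k]
        if any(window in target for target in targets):
            return True
    return False


# Hypothetical stand-in reference; NOT the coding region of SEQ ID No. 1.
reference = "ATGGATTCCAAACAGCAGTGTGTAAAGCTAAATGATGGCCACTTC"

# A 22-mer taken from the complementary strand of the stand-in reference.
probe = reverse_complement(reference[10:32])

print(shares_run(probe, reference, 10))                   # window of ten
print(shares_run(probe, reference, 20))                   # window of twenty
print(shares_run("ACGTACGTACGTACGTACGT", reference, 10))  # unrelated sequence
```

Under these assumptions, a sequence meeting the twenty-nucleotide test necessarily meets the ten- and fifteen-nucleotide tests as well, which mirrors the narrowing from claim 12 to claim 14.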

16. An oligonucleotide sequence selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39).

17. An oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).

18. An oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).
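
For illustration, the relationship between an antisense oligonucleotide of claims 16 to 18 and the strand it is complementary to can be sketched in Python. This sketch is not part of the disclosure; the function name `reverse_complement` and the variable names are hypothetical. It takes the oligonucleotide of SEQ ID No. 39 and the first 60 nucleotides of the exon of SEQ ID NO: 28 as given in the sequence listing, computes the reverse complement of the oligonucleotide, and looks for it in the exon fragment.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# Antisense oligonucleotide of SEQ ID No. 39, as recited in claim 16.
SEQ_ID_39 = "ATCTGAATATGGATAAT"

# First 60 nucleotides of SEQ ID NO: 28 (exon), copied from the sequence listing.
SEQ_ID_28_START = (
    "TTTTGCTAGCCACCCTAATTATCCATATTCAGATGAATATTAACATGGAGGGCTTTGCCT"
)

target = reverse_complement(SEQ_ID_39)
offset = SEQ_ID_28_START.find(target)  # 0-based index, or -1 if absent

print(f"length of oligonucleotide: {len(SEQ_ID_39)} nt")
print(f"sense-strand target: 5'-{target}-3'")
print(f"0-based offset within the exon fragment: {offset}")
```

A non-negative offset indicates that the oligonucleotide is fully complementary to a stretch of the listed exon; the same check could be applied to the other listed oligonucleotides against the sequences they are derived from.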

19. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39).

20. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).

21. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).



22. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:

introducing an oligonucleotide selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39) into cells.

23. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:

introducing an oligonucleotide selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48) into cells.

24. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:

introducing an oligonucleotide selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59) into cells.

25. An isolated chromosomal DNA fragment which upon transcription and
translation encodes type 5 17β-hydroxysteroid dehydrogenase and wherein said
fragment contains nine exons and wherein said fragment includes introns which are 16
kilobase pairs in length.

26. An isolated DNA sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 3 or a
complement thereof, to hybridize under stringent conditions to SEQ ID No. 3, or its
complement.

27. A method for producing type 5 17β-hydroxysteroid dehydrogenase, comprising
the steps of:

preparing a recombinant host transformed or transfected with the vector of
claim 3; and

culturing said host under conditions which are conducive to the production of
type 5 17β-hydroxysteroid dehydrogenase by said host.

28. A method for determining the inhibitory effect of a test compound on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing type 5 17β-hydroxysteroid dehydrogenase;

contacting said type 5 17β-hydroxysteroid dehydrogenase with said test
compound; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase in the presence of said test compound.

29. The method, as recited in claim 28, wherein said step of determining enzymatic
activity includes the steps of:

adding a substrate which is metabolized by said type 5 17β-hydroxysteroid
dehydrogenase; and

determining an amount of said substrate which is converted to metabolite.
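
As a purely illustrative aid, the determination recited in claims 28 and 29 can be reduced to simple arithmetic. The Python sketch below is not taken from the disclosure: the function names `fraction_converted` and `percent_inhibition` and all numerical values are hypothetical, and the sketch assumes that the remaining substrate and the metabolite formed are measured in the same arbitrary units, with and without the test compound.

```python
def fraction_converted(substrate: float, metabolite: float) -> float:
    """Fraction of the added substrate recovered as metabolite."""
    total = substrate + metabolite
    if total <= 0:
        raise ValueError("substrate and metabolite amounts must sum to > 0")
    return metabolite / total


def percent_inhibition(control: float, treated: float) -> float:
    """Percent reduction in conversion relative to the control without
    test compound."""
    if control <= 0:
        raise ValueError("control conversion must be > 0")
    return 100.0 * (1.0 - treated / control)


# Hypothetical measurements in arbitrary units; not data from the disclosure.
control = fraction_converted(substrate=600.0, metabolite=400.0)  # no compound
treated = fraction_converted(substrate=900.0, metabolite=100.0)  # with compound

print(f"control conversion: {control:.2f}")
print(f"treated conversion: {treated:.2f}")
print(f"inhibition: {percent_inhibition(control, treated):.1f} %")
```

In this sketch a test compound that leaves the conversion unchanged yields zero percent inhibition, while one that abolishes conversion yields one hundred percent.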






30. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 1 or a complement
thereof.

31. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 1 or a
complement thereof.

32. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 3 or a complement
thereof.

33. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 3 or a
complement thereof.

34. A method for determining the inhibitory effect of antisense nucleic acids on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing a host system capable of expressing type 5 17β-hydroxysteroid
dehydrogenase;

introducing said antisense nucleic acids into said host system; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase.